# ResT V2: Simpler, Faster and Stronger

Official PyTorch implementation of **ResTv2**, from the following paper:

ResT V2: Simpler, Faster and Stronger


--- 

<p align="center">
<img src="figures/fig_1.png" width=50% height=50% 
class="center">
</p>

We propose **ResTv2**, a simpler, faster, and stronger multi-scale vision
Transformer for visual recognition.

## Catalog
- [x] ImageNet-1K Training Code
- [x] ImageNet-1K Fine-tuning Code  
- [x] Downstream Transfer (Detection) Code

<!-- ✅ ⬜️  -->

## Results and Pre-trained Models
### ImageNet-1K trained models

|    name    | resolution |acc@1 | #params | FLOPs | Throughput | model |
|:----------:|:---:|:---:|:---:| :---:|:---:|:---:|
|  ResTv2-T  | 224x224 | 82.3 | 30M | 4.1G  | 826 | - |
|  ResTv2-T  | 384x384 | 83.7 | 30M | 12.7G | 319 | - |
|  ResTv2-S  | 224x224 | 83.2 | 41M | 6.0G  | 687 | - |
|  ResTv2-S  | 384x384 | 84.5 | 41M | 18.4G | 256 | - |
|  ResTv2-B  | 224x224 | 83.7 | 56M | 7.9G  | 582 | - |
|  ResTv2-B  | 384x384 | 85.1 | 56M | 24.3G | 210 | - |
|  ResTv2-L  | 224x224 | 84.2 | 87M | 13.8G | 415 | - |
|  ResTv2-L  | 384x384 | 85.4 | 87M | 42.4G | 141 | - |

## Installation
Please check [INSTALL.md](INSTALL.md) for installation instructions. 

## Evaluation
We give an example evaluation command for an ImageNet-1K pre-trained, then ImageNet-1K fine-tuned ResTv2-T at 384x384 resolution:

Single-GPU
```
python main.py --model restv2_tiny --eval true \
--resume restv2_tiny_384.pth \
--input_size 384 --drop_path 0.1 \
--data_path /path/to/imagenet-1k
```

This should give 
```
* Acc@1 83.708 Acc@5 96.524 loss 0.777
```
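To compare a run against the results table automatically, the summary line can be parsed from the log. The following is a minimal sketch, assuming the summary line has exactly the format shown above; `parse_eval_summary`, `matches_reported`, and the 0.1-point tolerance are hypothetical helpers and values chosen for illustration, not part of this repository.

```python
import re

# Format of the summary line printed at the end of evaluation (as shown above).
SUMMARY_RE = re.compile(
    r"\* Acc@1 (?P<acc1>\d+\.\d+) Acc@5 (?P<acc5>\d+\.\d+) loss (?P<loss>\d+\.\d+)"
)


def parse_eval_summary(log_text):
    """Return the last accuracy summary found in `log_text` as floats, or None."""
    matches = list(SUMMARY_RE.finditer(log_text))
    if not matches:
        return None
    return {key: float(value) for key, value in matches[-1].groupdict().items()}


def matches_reported(acc1, reported, tol=0.1):
    """True if measured top-1 accuracy is within `tol` points of the table value.

    The default tolerance is an assumption, not a value from the paper.
    """
    return abs(acc1 - reported) <= tol


summary = parse_eval_summary("* Acc@1 83.708 Acc@5 96.524 loss 0.777")
print(summary)
# 83.7 is the acc@1 listed for ResTv2-T at 384x384 in the table above.
print(matches_reported(summary["acc1"], reported=83.7))
```

In practice the log text would come from the captured output of `main.py` rather than a literal string.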

- To evaluate other model variants, change `--model`, `--resume`, and `--input_size` accordingly. You can get the URLs of the pre-trained models from the table above.
- Setting a model-specific `--drop_path` is not strictly required for evaluation, because the `DropPath` module in timm passes its input through unchanged in evaluation mode, whatever the rate; it is required for training. See [TRAINING.md](TRAINING.md) or our paper for the values used for different models.
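
The reason `--drop_path` does not affect evaluation can be seen in a small stand-alone sketch of stochastic depth. This is a pure-Python illustration over nested lists, not timm's implementation; the function name `drop_path` and its arguments are chosen here for illustration only.

```python
import random


def drop_path(samples, drop_prob, training, rng=random):
    """Stochastic depth over a batch: zero whole samples, rescale the survivors."""
    if drop_prob == 0.0 or not training:
        # Evaluation mode (or rate 0): identity, so the configured rate is irrelevant.
        return samples
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in samples:
        keep = rng.random() < keep_prob
        # Survivors are scaled by 1 / keep_prob so the expected value is unchanged.
        scale = 1.0 / keep_prob if keep else 0.0
        out.append([x * scale for x in sample])
    return out


batch = [[1.0, 2.0], [3.0, 4.0]]

# In evaluation mode the output equals the input for every rate.
for rate in (0.0, 0.1, 0.5):
    assert drop_path(batch, rate, training=False) == batch

# In training mode each sample is either zeroed or scaled by 1 / keep_prob.
print(drop_path(batch, 0.5, training=True, rng=random.Random(0)))
```

In training mode the rate changes the result, which is why the per-model value matters there.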

## Training
See [TRAINING.md](TRAINING.md) for training and fine-tuning instructions.

## Acknowledgement
This repository is built using the [timm](https://github.com/rwightman/pytorch-image-models) library.

## License
This project is released under the Apache License 2.0. Please see the [LICENSE](LICENSE) file for more information.

## Citation
If you find this repository helpful, please consider citing:
```
@Article{xxx2022restv2,
  author  = {xxxx},
  title   = {ResT V2: Simpler, Faster and Stronger},
  journal = {arXiv preprint arXiv:xxxxx},
  year    = {2022},
}
```
